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ABSTRACT 

A new method is presented to compute age estimates from theoretical isochrones 
using temperature, luminosity and metallicity data for individual stars. Based on 
Bayesian probability theory, this method avoids the systematic biases affecting simpler 
strategies, and provides reliable estimates of the age probability distribution function 
for late-type dwarfs. Basic assumptions about the a priori parameter distribution 
suitable for the solar neighbourhood are combined with the likelihood assigned to the 
observed data to yield the complete posterior age probability. This method is especially 
relevant for G dwarfs in the 3-15 Gyr range of ages, crucial to the study of the chemical 
and dynamical history of the Galaxy. In many cases, it yields markedly different results 
from the traditional approach of reading the derived age from the isochrone nearest to 
the data point. We show that the strongest effect affecting the traditional approach is 
that of strongly favoring computed ages near the end-of-main-sequence lifetime. The 
Bayesian method compensates for this potential bias and generally assigns much higher 
probabilities to lower, main-sequence ages, compared to short-lived evolved stages. 
This has a strong influence on any application to galactic studies, especially given 
the present uncertainties on the absolute temperature scale of the stellar evolution 
models. In particular, the known mismatch between the model predictions and the 
observations for moderately metal-poor dwarfs (—1 < [Fe/H] < —0.3) has a dramatic 
effect on the traditional age determination. 

We apply our method to the classic sample of Edvardsson et al. (1993), who derived 
the age-metallicity relation (AMR) of a sample of 189 field dwarfs with precisely 
determined abundances. We show how most of the observed scatter in the AMR is 
caused by the interplay between the systematic biases affecting the traditional age 
determination, the colour mismatch with the evolution models, and the presence of 
undetected binaries. Using new parallax, temperature and metallicity data, our age 
determination for the Edvardsson et al. sample indicates that the intrinsic dispersion 
in the AMR is at most 0.15 dex and probably lower. In particular, we show that old, 
metal-rich objects ([Fe/H] ~ 0.0 dex, age > 5 Gyr) and young, metal-poor objects 
([Fe/H] < —0.5 dex, age < 6 Gyr) in many observed AMR plots are artifacts caused 
by a too simple treatment of the age determination, and that the Galactic AMR 
is monotonically increasing and rather well-defined. Incidentally, our results tend to 
restore confidence in the method of age determination from chromospheric activity for 
field dwarfs. 

Key words: methods: statistical - Hertzsprung-Russell diagram - stars: evolution - 
stars: fundamental parameters (ages) - Galaxy: evolution 



1 INTRODUCTION 

Isochrone ages for field dwarfs - Bayesian approach 
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Theoretical stellar evolution models have proved spectac- 
ularly successful at explaining the position of stars in the 
colour- magnitude diagram (CMD), as a function of only 
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three parameters: mass, age and metallicity. Comparison of 
the mean sequences of open and globular clusters with theo- 
retical isochrones forms the basis of age estimations in astro- 
physics. Although second-order discrepancies subsist such as 
the subdwarf locus, the slope of the red giant branch, the 
width of the main sequence or the distance of the Pleiades, 
the method has now been improved to the degree of reaching 
a relative accuracy better than 10 percent for cluster ages 
from theoretical isochrones (see e.g. Rosenberg et al. 2002 
for globular clusters). 

In principle, this method can also be applied to individ- 
ual stars. The model predictions can be interpolated to as- 
sociate a given star's observed parameters, by inverting the 
relation given by the models between the physical param- 
eters (mass, temperature, abundances) and observable pa- 
rameters such as luminosity, temperature, metallicity, colour 
and magnitude. In particular, because late-F and G dwarfs 
have lifetimes comparable to the age of the Galaxy, deriving 
individual ages for field F and G dwarfs in the solar neigh- 
bourhood is of crucial importance to study the chemical and 
dynamical history of the Galactic disc. In practice, however, 
deriving ages for late-type dwarfs turns out to be difficult. 
For the ages typical of galactic populations, from ~ 1 Gyr 
up to the maximum age of stars in the Galaxy, the model 
isochrones are separated only by small distances in observ- 
able parameter space on and near the main sequence, and 
high accuracy on the temperature, distance and metallicity 
determinations, as well as high confidence in the absolute 
temperature and metallicity scales of the models and obser- 
vations, are needed to obtain meaningful results for individ- 
ual ages. 

One landmark study in that field is Edvardsson et al. 
(1993, hereafter E93), who obtained very accurate multi- 
element abundances for 189 field F and G stars and com- 
puted ages spanning the whole lifetime of the Galactic disc. 
They derived the ages for individual stars in their sample 
by comparison with Vandenberg (1985) isochrones. Their 
results have subsequently been recomputed with more re- 
cent Bertelli et al. (1994) isochrones by Ng & Bertelli (1998). 
Chen et al. (2000) have added multi-element metallicity data 
for 90 more disc stars, and Bensby, Felzing & Lundstrom 
(2003) for 66 stars. 

Following the availability of Hipparcos parallaxes for 
most nearby F and G dwarfs, ages have been derived for 
much larger sets of data (e.g. Asiain et al. 1999; Fcltzing et 
al. 2001; Ibukiyama & Arimoto 2002). The publication of 
the large Geneva-Copenhagen solar-neighbourhood survey 
(Nordstrom et al. 1999, hereafter GCS), with metallicities, 
distances and Stromgren photometry for more than 16'000 
local F and G dwarfs, is likely to prompt more such studies 
in the near future. 

Another important method to derive ages for field F-G 
dwarfs is the use of chromospheric activity as an age indica- 
tor (e.g. Kraft 1967, Noyes et al. 1984). It has been applied 
to a large sample of disc F and G dwarfs by Rocha-Pinto et 
al. (2000). The results of the isochrone and chromospheric 
age estimates are not showing satisfactory agreement. 

Deriving age estimates from isochrones for individual 
stars is an inverse problem. The tracks calculated from the- 
oretical evolution models define a function Y — ^(X) relat- 
ing the input physical parameters X (age, mass, abundance) 
to the observable parameters Y (e.g. temperature, luminos- 



ity, metallicity). The objective is to derive estimates of the 
physical parameters X from the observed Y . 

All studies quoted above have estimated isochrone ages 
for individual stars by selecting the isochrone nearest to the 
object in data space, i.e. computing J r ^ 1 (Y). The uncertain- 
ties affecting the resulting ages are estimated by probing the 
T^ 1 around T l (Y) according to the uncertainties affecting 
Y. 

However, in practice, there are two conditions for the 
inverse function JF -1 to provide an unbiased estimator of 
the real X: 

(i) the function must be reasonably linear within the 
uncertainties on Y . 

(ii) the uncertainties on Y must be much smaller than 
the variations in its a priori probability distribution. 

These two conditions are generally satisfied for the age 
determination of early-type stars at the bright end of the 
main sequence, with ages in the range 0-3 Gyr. In this 
regime, the observational uncertainties are generally smaller 
than the range over which the isochrones curve, and over 
which the a priori density varies. It is much less clear that 
the conditions are respected for later-type dwarfs, with ages 
in the 3-15 Gyr range. In this range, condition (1) is not sat- 
isfied, because the isochrones are densely packed and highly 
curved within the observational uncertainties. Condition (2) 
is not satisfied either: under any reasonable assumption, the 
a priori distribution of the observables can vary enormously 
within the uncertainties on Y. For instance, varying the lu- 
minosity within one or two sigma can move an observation 
from the main-sequence into the Hertzsprung gap, where the 
a priori density is an order of magnitude lower, or even be- 
low the ZAMS, where the a priori density is zero. Therefore, 
the isochrone age from the simple inversion of the function 
defined by the evolution models can suffer from significant 
systematic biases. Because those objects are the most in- 
teresting to study the history and evolution of the Galaxy 
(age range 3-15 Gyr), it is important to derive realistic and 
unbiased age estimates for them. 

When condition (1) is not satisfied, one way to obtain 
an unbiased estimator is to consider the complete probabil- 
ity distribution function ('pdf ') of Y given its uncertainties, 
prob(Y), and to compute the corresponding distribution of 
prob i^ 1 (Y)) • The resulting probability distribution func- 
tion is known as the likelihood. Methods based on the like- 
lihood can provide reliable estimators when the function re- 
lating the observed values to the unknown parameters is 
highly non-linear. 

When condition (2) is not satisfied, even estimators 
based on the likelihood will be biased. One common way to 
deal with the problem is to build simulations of the whole 
procedure, estimate the systematic biases from the simula- 
tions, and then correct the results with a posteriori compen- 
sations for the biases. However, Bayesian probability theory 
offers a much more robust method to obtain unbiased es- 
timates in that case. The Bayesian approach includes both 
the likelihood and the a priori distribution of the parame- 
ters to compute the complete, unbiased posterior probability 
distribution. 

When possible, fully Bayesian analyses are often 
avoided because of their large demand in computing time 
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and their conceptual difficulty. In recent years however, their 
use has become more and more frequent, following both the 
increase in computing power and the development of the 
theory. For a general introduction to Bayesian data analysis 
see Sivia (1996), for a very clear and detailed presentation 
see Jaynes (2003), and for a rapid overview centered on as- 
trophysical applications see Loredo (1990). In this paper, 
we assume that the notations and concepts of probability 
theory are familiar to the reader. We are mostly using no- 
tations as in Sivia (1996). Cases that necessitate a Bayesian 
analysis are rare. In general, measurement uncertainties are 
much smaller than the range of the a priori distribution. 
Bayesian analyses are necessary in the case of high relative 
errors on single-case estimates. They have been used to de- 
rive the most likely value of the cosmological parameters 
(e.g. Slosar et al. 2003) and to analyse the neutrino data of 
SN1987A (Loredo 1990). Another example is the case of the 
study of the bias affecting trigonometric parallaxes (Lutz & 
Kelker 1973): In the wake of the Hipparcos astrometric satel- 
lite mission, an extensive literature has appeared on the sub- 
ject (Oudmaijer, Groenewegen & Schrijver 1998; Pont 1999; 
Reid 1999; Arenou & Luri 2002; Smith 2003). In that case, 
it was found that with high-relative-error parallax measure- 
ments (a m /zu >~ 0.2), straightforward statistical methods 
brake down, and the effect of the prior begins to become so 
dominant that hardly anything meaningful can be derived 
about the value of the distance independently of the a priori 
assumptions. It is now accepted that a limit of ~10 percent 
has to be put on a m /vo to derive robust distance estimates 
from trigonometric parallaxes independently of further as- 
sumptions. The Lutz-Kelker effect is a typical Bayesian ef- 
fect showing when the shape of the prior has to be taken 
into account. 

In this paper, which is intended both as a reconsider- 
ation of previous data and as a preparation for the coming 
analysis of new large samples, we point out fundamental fea- 
tures in the analysis of ages for individual stars that have 
been overlooked by all previous studies that we are aware 
of, and that can profoundly affect the results. We examine 
in what way the age determination can be improved us- 
ing probability theory. We propose a method to compute 
the Bayesian age probability distribution for field stars and 
compare them to likelihood estimates (Part One). We then 
focus on the E93 sample as an illustrative application (Part 
Two). 

The E93 study and the Galactic age-metallicity relation 

The E93 sample of 189 stars in the solar neighbourhood still 
provides one of the most accurate and extensive bases for 
the study of the chemical evolution of the Galactic disc. In 
particular, many authors subsequently accepted their age- 
metallicity plot (repeated in Fig. but see companion fig- 
ures) and their interpretation of a very substantial spread of 
[Fe/H] for a given age in the solar neighbourhood. The plot 
shows a large intrinsic scatter at ages between 3 and 10 Gyr, 
with a standard dispersion of ~ 0.24 dex and almost no age- 
metallicity relation in this range. The age-metallicity plot 
for their sample is often presented as "the age-metallicity 
relation of the Galaxy" and a dispersion of the order of 0.25 
dex - corresponding to a total range of ~ 0.6 to 0.8 dex 
- in metallicity at a given age has been taken as an ob- 



servational requirement that the models of the Galaxy are 
requested to meet (e.g. Carraro et al. 1998) . However, more 
recently, Garnett & Kobulnicky (2000) have revealed an im- 
portant dependence of the metallicity dispersion with dis- 
tance in the E93 sample, indicating that the age metallicity 
relation (AMR) from E93 may have been affected by strong 
systematic biases. 

Rocha-Pinto et al. (2000) have studied the dispersion of 
the AMR with age determinations based on chromospheric 
activity. Their ages are only weakly correlated with the 
isochrone ages, and they find a low intrinsic dispersion of 
the AMR. This result is particularly significant given the 
statistical implausibility of decreasing the dispersion of the 
AMR with added uncertainties. 

The age-metallicity relation is one of the most impor- 
tant observational constraints on models of the evolution of 
the Galaxy. It expresses how stellar formation has enriched 
the interstellar medium over time, and therefore depends on 
the star formation rate, the chemical yields, the efficiency of 
recycling, infall and outflow, and the amount of mixing in 
the gas. Because the shear induced by differential galactic 
rotation spreads stars and gas around all galactic azimuths 
in a few rotations, theoretical models commonly assume a 
metallicity depending only on time and galactocentric radius 
(e.g. Chiappini et al. 1997). Several observations support 
this assumption. The inhomogeneities in the abundance of 
the interstellar medium are small (Kobulnicky & Skillman 
1996; Meyer, Jura & Cardelli 1998). R ~ Ro Cepheids show 
a low abundance dispersion ( Andrievsky et al. 2002) ; young 
open clusters show an abundance dispersion lower than 0.20 
dex (Twarog 1980; Carraro et al. 1998). The indication from 
older open clusters is more ambiguous (Twarog 1980; Piatti 
et al. 1995; Carraro et al. 1998; Friel et al. 2002), show- 
ing a large scatter that can be mostly attributed to a ra- 
dial metallicity gradient. Kotoneva et al. 2002a find that 
only a modest intrinsic scatter in the AMR is needed to fit 
the solar neighbourhood K-dwarfs data. In external spiral 
galaxies, the abundance dispersion of young features at a 
given galactocentric distance is typically much smaller than 
0.2 dex (e.g. Kennicutt & Garnett 1996 for HII regions in 
M101). 

In this context, the E93 result comes as a surprise, and 
a very difficult requirement to be met by the models. E93 
computed the present Galactic orbits of their objects, and 
concluded that only a small part of the observed metallic- 
ity dispersion was due to orbital diffusion i.e. the fact that 
metal-richer and metal-poorer stars born at lower or higher 
galactocentric radii cross the solar neighbourhood. The re- 
maining dispersion, covering a total range of the order of 
0.6-0.8 dex at a given age, was taken to be the indication 
of a very high dispersion of the metallicity of the gas at a 
given time and galactocentric radius (or even a complete lack 
of correlation between age and metallicity, e.g. Feltzing et 
al. 2001). Such a result implies a very inefficient azimuthal 
mixing of the gas in the disc, or extremely frequent infall 
episodes, with pockets of gas of very different metallicities 
sharing a similar radius at a given time. 

An alternative explanation for a high intrinsic disper- 
sion was explored by Selwood & Binney (2002) and Lepine 
et al. (2003), who showed that radial migration of the stel- 
lar orbits without conservation of the angular momentum, 
for instance under the influence of spiral arm perturbations, 
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could move stars in galactocentric radius by several kilo- 
parsecs within the Galactic disc. This could bring together 
stars of the same age on similar orbits but formed at very 
different galactocentric radii, therefore with very different 
metallicities because of the radial metallicity gradient in the 
disc of the Galaxy. 

In the second part of this article, we apply our age de- 
termination to the E93 sample as an illustration of our ap- 
proach to the age determination, and conclude that - as cor- 
rectly anticipated by Garnett & Kobulnicky (2000) - most 
of the "cosmic" dispersion in the AMR of E93 is due to un- 
certainties in the ages, and that the data actually indicate 
a cosmic dispersion of less than 0.15 dex. 



PART ONE: AGE ESTIMATES FROM 
ISOCHRONES FOR INDIVIDUAL LATE-TYPE 
STARS 

2 THEORETICAL BASIS 

2.1 Confrontation of the standard and Bayesian 
approach 

The standard approach 

Stellar evolution models define a function relating physical 
parameters to observable quantities: 

Y = F(X) 

where X are the physical input parameters, namely mass, 
age and abundance: X = (m,t,z), and Y are the ob- 
served quantities - observed or inferred directly from ob- 
servations - e.g. temperature, luminosity and metallicity: 

Y = (T,L,[Fe/H]). 

The standard approach to computing the age of an indi- 
vidual star from theoretical isochrones is to interpolate the 
stellar evolution tracks to find which age and mass value cor- 
respond to the same point as the star in the (T, L, [Fe/H]) 
space. 

Interpolation between the models is needed to yield a 
value of T for all (m, t, z) triplets 1 . Given the observed val- 
ues T obs , iobs and [Fe/H] obs , the standard approach thus 
inverts the relation T to find 

(m ,t ,z ) = J r ~ 1 (T ohs ,L ohs , [Fe/H]obs) 

where the " o" subscript denote the values for the considered 
object. 

The function T is not strictly bijective because 
isochrones do sometimes cross each other in the Y space. 

can be uniquely defined nevertheless by considering, 
when it is multiply defined, only the stage that is more slowly 
evolving. T~ l , of course, is also undefined in the large por- 
tion of the Y space that do not contain any evolution tracks. 

A simple way to estimate the uncertainties on the t age 
obtained by the inversion of the T function is to calculate 



1 In this section wc assume that for a given object, values of 
Tobsi Tjbs an d [F c /H]obs are obtained from the observations (the 
transformations from observed colours, magnitudes and parallax 
to T Q bs an d iobs a re considered reliable). The reasoning would 
be exactly the same if we use colour and magnitude instead of 
temperature and luminosity. 



the value on T 1 found by moving the data point according 
to the observational errors: 

(rn± tT ,t± a ,z± a ) = J r ~ 1 (T ohs ±a T , L obs ±a L , [Fe/H] bs±ff[Fe/H]) 

either one at a time or all at the same time. 

A slightly more sophisticated approach is to compute 
over the whole (logT, L, [Fe/H]) space, and to assign 
to each point the probability given by the distribution func- 
tion of the observational uncertainties. For instance, if the 
observational uncertainties are described by Gaussian func- 
tions with dispersions a[ Fe / H ], o"io g T and t7i og i, then the re- 
covered age distribution function is based on the likelihood 
function: 

£(T,L,[Fc/H]) = 



<T[Fe/H] <?T Oh (2n) 3 / 2 



x exp ■ 



x exp 



- (log Tobs - log T) 2 

2cr iogT 



exp • 



-(log Tobs - logT)^ 

2(T lo S L 



-([Fe/H]obs - [Fe/H]) 2 



2a\ 



(1) 



Fe/H] 



This likelihood is the conditional probability of a point 
being observed at (T b s ,L b s , [Fe/H] (, s ) given a true value 
of (T, L, [Fe/H]), or C (observed, true) = prob(observed|true) 
where the "|" symbol denotes conditional probabilities. The 
terms on the right result from the Gaussian distribution of 
the uncertainties. Instead of simply inverting the T function 
at the value of the data point, an age pdf can be obtained 
from the histogram in age of the likelihood over all possible 
ages: 

L t (t) = / C(T, L, [Fe/H]) d[Fc/H] dTdi 
Jr 

where R is the region in (T, L, [Fe/H]) space where the 
JF _1 (T, L, [Fe/H]) = t. The maximum of C t can be used 
as an estimator (maximum likelihood method). 



Bayes' theorem 

However, prob(observed | true) is not really the probability 
that one is trying to determine when performing a mea- 
surement. One is not attempting to estimate the probability 
of the observed value assuming the true value, but indeed 
the value of the true quantity, given the observation, i.e. 
prob (true | observed) . 

These two quantities are related through Bayes' theo- 
rem: 

prob(D|/f) 



pmb(H\D) 



prob(D) 



• prob(ff ) 



(2) 



where H can be any set of hypotheses (in our case the true 
age value) and D the observed data. The term on the left is 
called the posterior probability (the probability of H given 
D, which is what one wants to determine), the numerator 
on the right is the likelihood (the probability distribution 
of D assuming H, or the "likelihood" of observing D if H 
is true). prob(TT) is the prior probability (what was known 
about H a priori), and prob(D) is a normalizing factor inde- 
pendent of H (that can be ignored for our purposes). There- 
fore, according to Bayes' theorem, the condition for the like- 
lihood to be a good estimator of the posterior pdf is that the 
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prior pdf can be neglected. Then Bayes' theorem becomes 
prob(H|£>) ~ pmb(D\H) = £(D, H). 

The basic criterion to determine whether the prior prob- 
ability distribution can be neglected in a given problem is 
to compare the scale of variation of the prior pdf with the 
scale of the observational uncertainties. If the uncertainties 
are much smaller than the scale over which the prior varies, 
then the likelihood "overwhelms" the prior. This is the case 
for instance when the magnitude of a star is measured with 
an accuracy of, say, 0.01 mag. The prior pdf can vary from 
case to case, but a natural prior is to assume nothing on the 
magnitude, allowing for the star to be located at random 
in a 3-D space, which implies for the prior a dependence 
in io°' 6 ' Am ", a variation of a factor ~1.4 percent for each 
0.01 magnitude. Thus the variation of the prior is negligible 
compared to the variation of the likelihood (~ 30 percent 
on a — 0.01 mag for a Gaussian distribution). 

In some cases though, the prior pdf cannot be neglected. 
The prior probability varies a lot over the span of the ob- 
servational uncertainties. In other words, previous knowl- 
edge indicates that one part of the likelihood distribution 
is much less probable than the other. For example, imagine 
measuring with very low accuracy the magnitude of a star 
picked up at random from the HD catalogue, and obtaining 
mv = 7.0 ± 1.0 mag. The likelihood is a Gaussian, N(7, 1). 
It would not result however that 6.0 and 8.0 are equally 
likely for the star's actual magnitude, given the fact that 
faint stars are more numerous than bright stars by a factor 
jg0.6-Am„ m & magnitude- limited survey (that is the prior 
prob(//)). After the measurement, mv > 7 is a much better 
estimate of the true magnitude, prob(i?|_D), than mv < 7. 
At the other extreme, values my ~ 9 are much less likely 
than indicated by the function N(7, 1), because the HD cat- 
alogue has a magnitude limit mv ~ 8.5. In this example the 
likelihood was not peaked enough to "overwhelm" the prior, 
and the posterior pdf obtained through Bayes' theorem re- 
sembles the prior more than the likelihood. 

Another, more astrophysically relevant, such situation 
is the case of trigonometric parallaxes with high relative 
errors (see Introduction for references). The prior distribu- 
tion of parallaxes for a single object is very steep: knowing 
that parallaxes are the inverse of distances and that space 
is 3-dimensional, it can be inferred that lower values of the 
parallax zu are more likely a priori by a factor zu~ 4 for a 
given object. Only if the parallax error is much smaller than 
the range over which this factor changes (a^jvj <~ 10 per- 
cent) does the prior knowledge become irrelevant, and one 
can proceed to derive an unbiased estimate of the distance. 

When the conditions for neglecting the prior are not 
satisfied, one way to proceed is to use likelihood estimators 
anyway, and to deal with the biases introduced by neglecting 
the prior with methods like weighted statistical indicators, 
ad hoc empirical corrections, or bias corrections from Monte 
Carlo simulations. This is often the only alternative when 
the mathematical structure of the problem is complex and 
there is no clear way to characterize the prior prob(ff). 

However, equation provides the means to calculate 
the correct unbiased posterior pdf if a functional form of the 
prior can be given. The prior is in general not known exactly, 
but that is often not necessary. A reasonable approximation 



is generally sufficient 2 . In the case of parallaxes, assuming 
that space is 3-dimensional and Euclidean is enough to give 
a good prior for the analysis of the Lutz-Kelker bias. In the 
case of the ages, reasonable assumptions on the prior can be 
made, for instance a flat age distribution and some power- 
law mass distribution. 

It is important to remember that although the depen- 
dence of the posterior pdf on the prior pdf could be taken 
as a drawback of the Bayesian approach, the likelihood es- 
timates are independent of the prior only in appearance. In 
reality, they make a much more obviously invalid -hidden- 
assumption about the prior by implicitly assuming a flat 
prior in the space of the observable data. In the case of 
the ages, that means assuming that the HR-diagram is uni- 
formly filled, which is obviously very far from true. This 
hidden assumption is of no consequence in cases when the 
experimental errors are very small compared to the changes 
in density in the HR-diagram. However, for late-type dwarfs, 
that is not the case. The whole width of the main sequence 
is only a few times the size of the observational errors, and 
the a priori density varies a lot within the error intervals. In 
that case even a low-information prior like a flat age distri- 
bution is much better than a flat prior distribution in data 
space implicit in likelihood methods. 



Bayesian age estimates from isochrones 

Fig. Q illustrates a representative example for the solar- 
neighbourhood: deriving ages from the observed colour and 
magnitude is done by comparison of the data with isochrones 
from theoretical evolution tracks. Let us ignore the metal- 
licity dimension for the time being. The background dots 
in Fig. display a typical distribution of stars in the HR- 
diagram for a magnitude-limited sample of the Galactic 
disc (from GCS), with a dense main-sequence and sparsely 
populated Hertzsprung gap, and a superposition of many 
"turnoffs" due to the mixture of stars with a wide range 
of ages. The error bars on the measured parameter are 
shown for a single point. This point is located on the 10 
Gyr isochrone and therefore has J-" _1 (r, L)=10 Gyr. 

Focusing on this data point, we first note that the obser- 
vational errors are large compared to the regions where the 
T function can be linearised. This implies that the descrip- 
tion of the age probability distribution by a central value 
and a single error bar obtained by the propagation of the 
uncertainties on the observed parameters T and L will not 
be a good representation. In particular, the age pdf can be 
expected to have a wide tail towards lower age values. Al- 
though the one-sigma interval remains within t = 10±2 Gyr, 
the 3-sigma interval reaches t = Gyr. This asymmetry can 
be taken into account by computing the complete age likeli- 
hood pdf £.(t) = P(T bs, L bs\ t), plotted as a dotted line in 
the lower panel of Fig. Q 

However, as was reminded in the previous section, the 



2 If it is not, it means that the result will be more sensitive to 
the prior that to the likelihood. In plain English, more sensitive to 
what was known a priori than to the measurement. In that case 
(to paraphrase Loredo 1990), maybe one should consider getting 
better measurements! 
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likelihood can still be a biased estimator of the actual prob- 
ability distribution of the real age if the uncertainties do not 
make the influence of the prior pdf negligible. If our point 
is a random representative of the sample it was picked up 
from, then the prior pdf resembles the density of background 
points in the (T, L) plane (see Section 13.21 on more details 
on the computation of the prior). The prior is seen to vary 
greatly within the span of two times the observational er- 
rors or less, dropping by a large factor when moving from 
the main-sequence zone to the subgiant zone (Hertzsprung 
gap). Clearly, the conditions for using the likelihood as an 
estimator are not satisfied and any estimator based on the 
likelihood only will be biased. 

The fundamental reason for this is the acceleration 
of stellar evolution after the main sequence phase. A star 
spends much more time on the slowly-evolving part of the 
main sequence than on the rapidly evolving subgiant zone. 
Therefore, a mixed-age population will be much more dense 
on the main-sequence than above it. As a consequence any 
point observed above the main sequence, with error bars 
that are of the order of the distance separating it from the 
slow-evolving zone, has considerable probabilities of being 
actually located on the main sequence and brought where it 
is observed by observational errors ("contamination"). The 
likelihood does not take this into account and gives equiva- 
lent probabilities to positive and negative errors. The prior 
term prob(LT) of Bayes' Theorem is what allows this to be 
taken into account. 

The lower panel of Fig.Qcompares the Bayesian poste- 
rior pdf prob(t\T bs, L bs) for our sample point to the distri- 
bution of the likelihood. The age probability is significantly 
shifted towards the slow-evolving main sequence. Values be- 
tween and 5 Gyr, that were practically excluded by the 
T~ x and likelihood estimators, now have significant proba- 
bilities, and the median of the pdf is shifted from 10 to 7.5 
Gyr, implying a systematic bias of 2.5 Gyr or 25 percent. 

In this 2-D illustration we did not take the metallicity 
into account. Note however that uncertainties on the metal- 
licity are also of the same order of magnitude as the expected 
variations of the metallicity prior pdf, so that likelihood- 
based estimates will also suffer from metallicity-dependent 
biases (see Section l3.21 . In the next section, we consider the 
issue analytically in the even more simplified 1-D case, be- 
fore moving to a more complete calculation in Section \l. 31 



2.2 Simplified 1-D approach 

By reducing the dimensionality of the problem, we can il- 
lustrate some fundamental statistical features of the age de- 
termination. Les us assume here that the age, t, is com- 
puted from a single observable parameter, the logarithm of 
luminosity log L. The objective is to estimate the probabil- 
ity distribution of the real age, t, given the observed value 
log L b s and the transformation law logL = J~{t). This is 
done through Bayes' theorem: 



prob (t j logL obs ) = 



prob(logL obs | t) 
prob (log £ obs ) 



prob(i) 



Figure 1. Representative example of a colour-magnitude dia- 
gram for dwarfs in the solar neighbourhood (from GCS) with 
metallicities near solar. The isochrones from Girardi et al. (2000) 
for z=0.02 are overlaid. Typical observational uncertainties are il- 
lustrated for one data point in the post-main sequence zone. The 
age probability distributions for this point are given in the lower 
panel. Dashed line: Gaussian distribution with 10±2 Gyr. Dotted 
line: likelihood pdf for the sample point. Solid line: posterior age 
pdf. 
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Without prior knowledge on the real age, we assume that 
the prior probability prob(t) has a flat distribution between 
and t max . The difficulty is that the likelihood is expressed 



in the parameters of the observables, while the prior is ex- 
pressed in terms of the physical parameter t. Since Bayes' 
theorem has to be expressed in a coherent set of variables, 
the probability distribution functions have to be modified 
accordingly (using the transformation !F). In one dimension, 
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this is done by the chain rule: 

prob(log L) d log L — prob(logL) | ^ - | dt 

where prob(log L) should be expressed in the variable t. 

An actual example of the transformation between t and 
log L is shown in Fig. [2] The T relation at solar temperature 
and metallicity can be approximated by two linear relations 
with a strong change of slope at the end of the main se- 
quence: 



\o % L = T{t) = 



at + 7 



if < t < t 



P(t - t ) + at + j if to <t< i majt 

0.15, 7 



with /? > a. In our example, a = 0.015, j3 
-0.058, to = 9.2 Gyr and i max = 11 Gyr. 

We further assume that the noise measurement on log L 
has a Gaussian probability distribution with standard devi- 
ation (Jlog l ■ 

In this case the likelihood is expressed as: 



prob(log L obB | log L) 



Clog L 



exp - 



(logL ob3 -logL) 2 

2 <L 



The uniform prior probability distribution of the age t is 
transformed, using the chain rule, into the prior probability 
distribution expressed in log L: 



prob(logL) 



Cl 
C2 



if log L < log Lo 
if log L > log Lo 



With ci = ((3/a)c2 and logLo = J-(to)- The prior expressed 
in log L is therefore a step function. 

Finally, the posterior pdf can be expressed in the age 
variable, t: 



prob(i|i oba ) = 



-J^exp- 



/2tt 



("*+7-l°g£ obs ) 

2^ 

(/3(t-t ) + a to + 7-log£ ob3 ) 2 
2^ 





where 



if < t < to 
if t <t < t„ 
otherwise 



Let us consider a measurement obtained at the value 
log L bs=J-{t — 8.5 Gyr). Fig. [3] shows the corresponding 
prior, likelihood and posterior probability distributions in 
terms of log L and of t. 

Several effects are apparent: 

(i) The transformation laws jT _1 (logL) and T(t) can 
qualitatively modify the probability distributions. Even if 
the likelihood pdf has a Gaussian shape in the variable log L, 
it can be drastically different when expressed in age. 

(ii) In the literature, the quoted <r_ , cr+ around the esti- 
mated age are often the simple transformation of the two val- 
ues meaniog l — criog t and mean\ os l + o"i og l of the likelihood 
function through J r ~ 1 . These are not necessarily directly re- 
lated to the quantiles or standard deviation of the posterior 
age pdf. The posterior pdf may have a non-Gaussian shape, 
its mean value, moments and quantiles may be all modified 
by the prior and by the transformation law T. In the exam- 
ple of Fig.[3]the posterior pdf has become very asymmetric. 
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Figure 2. Solid line: Transformation law logL = T(t), i.e. 
the logarithm of the luminosity, log L, as function of the age, t, at 
fixed solar temperature (5780 K) from Geneva stellar evolution 
models (Schaller et al. 1992). Dashed line: simple modelisation 
of T in two regimes. The slope of T, low below 9.2 Gyr (main- 
scqucncc phase), becomes much higher above 9.2 Gyr (evolved 
phase). The ratio between the speed of evolution (ratio of the 
slopes) is about a factor 10. 



(iii) The prior pdf expressed in the variable log L is un- 
evenly distributed, with a lower probability for log L for ages 
higher than 9.2 Gyr. This results in a strong decrease of the 
posterior pdf compared to the likelihood for ages higher than 
9.2 (see below). 

In this 1-D model, ages that are not calculated though 
Bayes' theorem are subject to a systematic bias that we call 
the "terminal age bias". "Terminal age" refers to the age for 
which the T relation changes slope, roughly corresponding 
to the end-of-main-sequence lifetime. Fig. 0] illustrates this 
bias by plotting the age T~ x (log L) against the real age for a 
randomly drawn sample with < t < t max - The histogram 
of the likelihood ages is given for the real age interval 6 < 
t < 7, showing the strong excess of the likelihood ages near 
the terminal age. 

The biased nature of the likelihood-based method is ap- 
parent. Maximum likelihood gives the "best" solution in the 
sense of taking the most probable value of the likelihood 
function. However, it does not take into account the fact 
that some values of log L are more likely a priori due to the 
shape of the T function, in our example that contamination 
from the slower-evolving main-sequence is important in the 
subgiant zone. 

When using a maximum likelihood method, systematic 
biases have to be evaluated with additional Monte-Carlo 
simulations, and the estimator corrected if necessary. Ad- 
ditional knowledge of the system that was not used to com- 
pute the statistical estimator is added in the Monte-Carlo 
simulations. 

Bayesian methods take a more direct approach by inte- 
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Figure 4. (a) Uniform random sample of true ages, as a function 
of the age determined with the direct method J? 7-1 (log L). The 
discontinuity of the distribution of ages is clearly visible at the 
age corresponding to the end of the main sequence. We call this 
phenomenon the terminal age bias, (b) Histogram of ages from the 
direct method with true ages between 6 and 7 Gyr, as indicated 
in (a). 




Age [Gyr] 

Figure 3. Upper panel: Prior of uniform age (dotted line), like- 
lihood (dashed) and posterior (solid line) probability distribu- 
tion functions, all expressed in log L. Lower panel: Prior (dotted 
line), likelihood (dashed line) and posterior probability distribu- 
tion (solid line) functions expressed in age. 



grating all the knowledge about the system in the posterior 
pdf. This makes the results sensitive to the underlying as- 
sumptions, but removes strong systematic biases from the 
posterior pdf. 

2.3 Extension to 3-D and Monte Carlo integration 

We can now extend the previous discussion from 1 to 
3 dimensions. Let us consider again the stellar evolution 
models as a function T relating the "physical" parame- 
ters X = (m,t,z) to the "observational" parameters Y = 
(T, L, [Fe/H]), with three components: 

C T = T T {m,t,z) 

Y = P(X)<*\ L = T L (m,t,z) 

{ [Fc/H] = \og(z/z ) 

Given an observed data triplet, 

Y = (T ohs ,L ohs ,[Fe/H] obs ) 

we want to calculate the conditional probability of X, 
prob(X|y o ), in particular the age conditional probability 
prob(a|Y ) for all possible ages a. 
According to Bayes' theorem: 

prob(X | Y ) ~ prob(X) • prob(Y ] X) 

Using the marginalization theorem , we integrate over mass 
and metallicity, to find the probability that the real age is 
equal to a given value to: 



3 the Marginalization theorem states that prob(A|B) = 
J* prob(A, C|.B)dC 
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prob(t \Y o ) 



prob(X)prob(Y jX)dmd2 (3) 



where the integral is done over the region R defined in the 
(m, t, z) space by the condition t = to- 

prob(Y D i X) is the likelihood, C(Y a , F(X)). For in- 
stance, if the uncertainties on the observed parameters are 
Gaussian with dispersions criogT, o"i og L and Gypc/m, then 
the likelihood would be as in Eq. [T] with (T,L, [Fe/H]) = 
T(m, t, z). 

The term prob(X) is the prior probability distribution. 
It is the distribution expected for the parameters a priori, in 
terms of m, t and z. It can also be thought of as the density 
in the (m, t, z) space of an imaginary parent sample, and we 
shall therefore note this term like a density, p(X). 

In order to compute the integral the likelihood must 
be expressed in terms of (m,t,z). The change of variable 
from (T, L, [Fe/H]) to (m, t, z) is more complex than in the 
one-dimensional case and involves the Jacobian determinant 
of T: 



dT T /dt 
dT L /dt 
dT [Fc/H] /dt 



3Tt /dm 
8Tl /dm 
dT [F e /H] /dm 



dT T /dz 
dT L /dz 
d^[ Fe /H]/dz 



Then, 



prob(t \Y o ) 



p(X) C{Y ,F{X)) J(X) dmdz(4) 



R(t=t ) 



In practice, evaluating the Jacobian of the T function 
at all points of the 3-dimensional parameter space is a very 
time-consuming operation. 

Integral Q can be evaluated much more easily by 
Monte Carlo integration, which makes the change of vari- 
able unnecessary. In practice, only this approach can en- 
sure results within realistic computation times for the full 
3-dimensional model. 

A large sample of (m, t, z) triplets can be drawn follow- 
ing the density p(m,t,z), then the likelihood is computed 
for all triplets, and the results collected in age bins, i.e. 



prob(i | Y )dt 



E 

to — dt/2<t<t +dt/2 



C{Y ,T{m,t,z)) 



This method has the considerable advantage of requir- 
ing no inversion or differentiation of the T function, which is 
difficult and subject to many numerical instabilities, and is 
even undefined in many regions of parameter space (where 
stellar evolution tracks overlap and where no track passes) 
and of making the change of variable from (m, t, z) to 
([Fe/H],T, L) easy. It also allows great flexibility as to the 
assumptions on the prior. For instance, the inclusion of po- 
tential binarity becomes straightforward (see Section 13.31 . 
Another advantage is that building a random sampling of 
the p(m,t, z) density is equivalent to the more familiar pro- 
cedure of generating a synthetic stellar population, so that 
existing algorithms designed for this latter task can be used. 



3 PRACTICAL APPLICATIONS 

3.1 Realistic expressions for the likelihood 

For simplicity, we have made up to now the unrealistic as- 
sumption of Gaussian uncertainties on temperature and lu- 



minosity. In practice, the likelihood can be expressed with 
suitable assumptions on the distribution of the uncertain- 
ties on the colour, magnitude, logarithmic temperature or 
trigonometric parallax. For instance, one can assume Gaus- 
sian uncertainties on logT, [Fe/H] and the parallax w. In 
that case the likelihood becomes 



C(T obs , uj obs , [Fe/H] obs , m, t, z) = 



■ exp - 



(logr o b a - log TT{m,t,z))' 1 



C [Fe/H] CroClog T (27r) 3 / 2 

(TU obs - +2.51og^ (m ,M)/5+l ) 2 



x exp - 



x exp 



2<r| 



([Fe/H] of)s -log^) 



2a; 



Fe/H] 



where V the visual magnitude, and Ty the magnitude pre- 
dicted from the stellar evolution models. 



3.2 Choosing a prior 

Let us now build a realistic prior for the specific case of the 
solar neighbourhood. The prior distributions of m, t and z 
are assumed to be independent, i.e., no prior information is 
assumed on an age-metallicity relation, or on a time vari- 
ation of the mass distribution. In that case, the mass, age 
and metallicity prior can be considered separately: 

prob(y) = prob(m) • prob(t) ■ prob(z) 



The mass prior 

The mass prior can be chosen as one's favourite initial mass 
functions (IMF) derived for the Galaxy. Within reasonable 
limits, the precise choice of IMF will not have a strong influ- 
ence, because the mass range covered by the F and G dwarfs 
is not large. 



The age prior 

The age prior is the expected age distribution of all stars ever 
born in the sample considered (the fact that some of them 
have already died is accounted for in the T function), in 
other words the star formation rate (SFR) of the sample con- 
sidered. The SFR of the Galactic disc is not precisely known. 
It seems to have been globally constant or slightly decreas- 
ing (Hernandez, Valls-Gabaud & Gilmore 2000; Chang, Shu 
& Hou 2002; Vergely et al. 2002), but its small-scale struc- 
ture is still largely unknown. At this stage a flat age prior 
is a reasonable assumption. Decreasing priors can also be 
used. Within reasonable limits, the slope of the SFR does 
not make large differences in the recovered age pdfs. 

Note that using a flat age prior is not at all equivalent 
to ignoring the age prior. A flat prior in age is far from trans- 
lating into a flat prior in parameter space (see Section \2.1l . 
Fig-Elshows the prior distribution of magnitude at solar val- 
ues of temperature and metallicity resulting from a flat age 
prior. The abrupt slope towards bright magnitudes is due to 
the acceleration of evolution in temperature and magnitude 
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Figure 5. Magnitude probability distribution corresponding 
to a fiat age prior, with logT = 3.76 and [Fe/H] = 0.0. The 
sharp shape is due to the highly non-linear nature of the function 
F L (m,t,z). 




after the main-sequence phase. The cutoff at faint magni- 
tudes is obviously due to the absence of models below the 
zero-age main-sequence. 

For a flat or slightly decreasing age prior, an upper age 
cutoff must be chosen. This somewhat arbitrary procedure 
has a direct influence in the derived age distribution by sim- 
ply removing all ages above the cutoff. At present, the max- 
imum age of the stars in the Galactic disc is not well known. 
The age of the thin disc has been studied by Binney et al. 
(2000), but the solar neighbourhood also contains thick disc 
stars, whose age may be several Gyr higher than that of the 
oldest thin disc stars. 



Figure 6. Mctallicity distribution for the solar neighbourhood 
and metallicity prior. Stars: G-dwarf volume-limited mctallicity 
distribution according to j0rgensen (2000). Dotted line: Metal- 
licity distribution of the GCS survey. Dashed line: Metallicity 
prior adopted in the present paper. Insert: Gaussian probability 
distributions corresponding to dispersions cf\fc/h] =0.05, 0.10 
and 0.15 dex. 
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ibrations (e.g. Ibukiyama & Arimoto 2002, with <Jy Fe / H ^ ~ 
0.15 dex), the influence of the metallicity prior will become 
so important that the derived ages will be highly dependent 
on it and extremely uncertain. Maximum likelihood ages 
will be strongly biased, and will produce visibly unreliable 
results such as fig. 5 of Ibukiyama & Arimoto (2002). It is 
apparent from our Fig. HJthat (J[f e /h\ ~ 0.15 implies that 
the variation of the likelihood will not be steeper than the 
variation of the prior, which is the Bayesian definition of a 
"bad measurement". 



The metallicity prior 

In the case of the Galactic disc, a good metallicity prior 
is the expected distribution of a volume-limited sample of 
the solar neighbourhood. Two such distributions are dis- 
played on Fig. |5] The influence of the metallicity prior 
depends on the size of the observational uncertainties on 
[Fe/H]. For very high accuracy spectroscopic metallicities, 
with cr[ Fe /H] ~ 0.05 dex (e.g. E93), the likelihood is narrow 
enough to overwhelm the changes in the prior. For uncer- 
tainties most typical of larger surveys and of photometric 
metallicities, CT[p e /B] — 0.10 dex, the influence of the prior 
becomes more significant, especially in the regions where it 
is varying more rapidly: at the high-metallicity end and at 
the connection between the main thin-disc distribution and 
the thick-disc tail near [Fe/H] ~ —0.5. The systematic bias 
on likelihood-based methods can reach 0.1 dex in metallicity, 
causing systematic biases on the derived ages. 

When the metallicity uncertainty U[ Fe / H j is even higher, 
for instance when collecting metallicities from different cal- 



3.3 Accounting for undetected binaries 

Up to this point, we have considered that all stars in 
our sample obeyed the T relation between (m, t, z) and 
(T, L, [Fe/H]) with Gaussian uncertainty distributions. Even 
without considering such second-order effects as rotation or 
Helium abundance, this relation does not always hold for 
real stars, particularly in the case of undetected binaries. 
The light from the companion of an undetected binary can 
move a given object up to 0.75 mag (for equal-mass binaries) 
above its true position in the colour-magnitude diagram. 
This obviously has a profound effect on the age determina- 
tion. 

If the number of undetected binaries in the sample is 
not too large, its effect on the age determination for main- 
sequence stars will be manageable. A few of the ages will 
turn out to be overestimated. 

The effect of the binaries on the age determination of 
evolved stars in the subgiant zone, however, will be large. Be- 
cause the evolution is more rapid in the subgiant zone, the 
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Figure 7. Theoretical isochrones from the Padua models for 
[Fe/H] = -0.7 (0,1,2,3,5,10 and 15 Gyr) and objects in the GCS 
catalogue with —0.75 < [Fe/H] < —0.65. Detected binaries are 
indicated with open symbols. The error bars show the uncertain- 
ties due to the parallax. The dashed line is the solar-metallicity 
ZAMS. Part of the obvious temperature mismatch between mod- 
els and observations may be due to the slope of the metallicity 
prior. Taking the models and observations at face value would 
lead to assigning terminal ages to practically all stars. 
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probability of finding a star in a subgiant stage is much lower 
than for the main- sequence stage. On the other hand, un- 
detected binarity can move main-sequence objects upwards 
into the subgiant zone of the (T, L) plane. As a result, the 
contamination of binaries in the subgiant zone can be very 
important. 

The standard approach to the age determination offers 
no obvious way to deal with the binary contamination, which 
has to be treated as a nuisance and studied with separate 
simulations. 

In the Bayesian formulation, the inclusion of undetected 
binaries is not particularly difficult. A term can be added to 
equation © integrating the hypothesis that the object can 
be an undetected binary: 

prob(t\Y ) ~ prob(i) • prob(l^o|single or binary, t) 

Because binarity and non-binarity are mutually exclusive, 
the probability sum rule can be used to yield: 

prob(t\Y ) ~ prob(i, single)prob(Y" |t, single) 

+prob(i, binary)prob(y o |f, binary) 
If age and binarity are independent, then 

prob(t\Y ) ~ prob(t)(prob(y o |t, single) + q prob(Y \t, binary)) 

where q = prob(binary)/prob(single) is the rate of unde- 
tected binaries. 

In practice, because binarity has such a large effect on 
the age determination, one may be less interested in knowing 



the age pdf in the case of binarity than to know the total 
probability for the star to be a binary, "prob(binary | Vo)". 
To calculate this probability, we integrate over all values oft: 

P bin = prob(binaryjY ) = 
J prob(binary, t\Y )At = 

q J prob(t)prob("K jt, binary)dt 

The terms inside the integral can be calculated as in 
Section using the modified relation T' between (m,t,z) 
and (T, L, [Fe/H]) suitable for binaries, depending on the 
mass ratio parameter. 

The computations show that Pbin is small in some parts 
of the (T, L) plane and much larger in others, particularly 
0.75 mag above the main-sequence, as could be expected. At 
that position, it reaches about 10 times the value of q (im- 
plying that for an undetected binary rate of 10 percent, the 
object is actually as likely to be a binary as a single star). 
The interesting thing is that the Bayesian computation not 
only yields a specific value of Pbin for each object for any 
subsequent statistical study, but also includes in the poste- 
rior age pdf the possibility of the star being an undetected 
binary. In this way, undetected binaries are less likely to 
introduce unrecognized contamination in the scientific anal- 
ysis of the results (see Section |SJ. 

3.4 Choice of stellar evolution models and 
temperature scale match 

Sets of stellar evolution tracks for late-type dwarfs have been 
produced by many different groups. Some of the most widely 
used are Girardi et al. (2000), Yi et al. (2001), Lejeune 
& Schaerer (2001). The agreement between the predicted 
isochrones from the different groups is generally good on 
or near the main sequence, so that using one set of models 
rather than another doesn't introduce dramatic differences 
in the derived stellar ages. Two robust predictions of stel- 
lar evolution theory, that heavier stars evolve more quickly 
and that stars on the main sequence get brighter with age, 
provide the dominant tendencies. 

There are, however, important residual differences be- 
tween the sets of models, that have a significant influence 
on the age determination. Of particular importance are the 
known difficulties related to the model temperature pre- 
dictions: absolute temperature scale of the models, colour- 
temperature conversions, metallicity dependence of the po- 
sition of the unevolved main sequence. There also are signif- 
icant systematic differences between the model temperature 
predictions and the observed position of well-measured field 
dwarfs (Lebreton et al. 1999; Lebreton 2000; Kotoneva et al. 
2002b). 

As far as the age determination is concerned, the im- 
portant fact is that the models and observations be on 
the same relative temperature scale. The observation of 
nearby unevolved K-dwarf stars with well-known paral- 
laxes and metallicities shows that the actual colour change 
with metallicity is significantly lower than model predictions 
(Kotoneva et al. 2002b). Several explanations have been 
proposed for this mismatch, including problems with the 
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temperature-colour conversions, met allicity- dependent he- 
lium abundances, and heavy-element sedimentation (Lebre- 
ton et al. 1999). None of these effects seems able to account 
for the whole mismatch. Some moderately metal-deficient 
dwarfs in the solar neighbourhood are compared with model 
predictions in Fig. Q to illustrate the amplitude of the mis- 
match between the Girardi et al. (2000) models and the 
observations. Obviously, ages derived with such a large tem- 
perature mismatch will be strongly biased towards terminal 
ages. 

Several schemes can be adopted to ensure that the mod- 
els and observations are coherent, either at the level of the 
colour-temperature conversion, or as empirical temperature 
shifts in the models. 



PART TWO: APPLICATION TO THE E93 
SAMPLE AND THE AGE-METALLICITY 
RELATION 

4 BAYESIAN AGES FOR THE E93 SAMPLE 
AND SOLAR NEIGHBOURHOOD AMR 

In this Section, we apply the Bayesian age calculation to 
a specific case, the landmark E93 study (see Introduction). 
As an illustration and important application of the approach 
developed in the previous sections, we calculate the poste- 
rior age pdf for the objects of the E93 sample, and recon- 
sider their determination of the age-metallicity relation of 
the Galactic disc. 

4.1 Recent data for the E93 sample 

The E93 sample was selected from the large Olsen (1994) 
catalogue of F and G dwarfs in the solar neighbourhood. 
The selection criteria were approximately a range in tem- 
perature, 5600 < T < 6800, and in evolution away from the 
Zero-Age Main Sequence, Mv — Mv,zams > 0.4 mag. The 
objective was to select stars in the subgiant portion of the 
CMD, where the isochrones are more widely spaced and age 
determinations are presumably more accurate. 

The main emphasis of E93 was on providing accurate 
metallicities. They estimated the relative accuracy of their 
metallicities to be 0.05 dex. They derived ages for their ob- 
jects by comparison with Vandenberg (1985) isochrones. The 
adopted ages were that of the isochrone crossing the posi- 
tion of the data in the temperature-luminosity plane. The 
uncertainties on the ages were estimated to be around 0.1 
dex, based on the direct propagation of the observational 
errors. In our notation, E93 have computed age estimates 
from 

t = ^- 1 (T,L,[Fe/H]) 

and evaluated the uncertainties with 

F-\T±<tt,L±ctl, [Fe/H] ± a [Fe/H] ) 

E93 provide evidence for the 0.1 dex value of the age 
uncertainties by displaying data for M67 and showing that, 
indeed, the dispersion of the inferred ages is of the expected 
order, at least in the subgiant portion of the CMD. 

The data for all E93 objects have been significantly im- 
proved in the intervening decade, the only exception being 



the individual metallicities, that were of high relative accu- 
racy in E93. The most noteworthy addition is the availability 
of Hipparcos parallaxes, that allow the distances - hence the 
absolute magnitudes - of the objects to be known with much 
better accuracy than was available to E93. Most stars in the 
sample have distances of less than 50 pc, and correspond- 
ingly uncertainties of less than 5 percent in the Hipparcos 
parallaxes, in contrast with the 14 percent uncertainty as- 
sumed by E93 for the photometric distances. 

The second important addition to the E93 data is that 
of further radial velocity monitoring (GCS), that has re- 
vealed a certain number of spectroscopic binaries. 

Ng & Bertelli (1998) have reconsidered the E93 sam- 
ple with Hipparcos distances, and with the Bertelli et al. 
(1994) stellar evolution models. However, their age deriva- 
tion method is not fundamentally different from E93, and 
they consequently derive a similar age-metallicity plot. It is 
already apparent in their study, though, that the new dis- 
tances considerably reduce the number of stars in the high- 
metallicity, high-age section of the diagram. This could be 
expected, because it is precisely in this region, the terminal- 
age subgiant region, that objects get preferentially scattered 
by the distance errors (see Section r2.2H . It is also worth not- 
ing that many stars are put back on the main-sequence with 
AMv < 0.4 by the new distances, proving the reality of the 
biases associated with the non-Bayesian calculation. 

Another valuable improvement was brought by 
Lachaume et al. (1999), who computed the distribution of 
the likelihood for the age estimation of a sample of 91 local 
field dwarf stars. While this still ignores the prior pdf, it 
does show that the l-cr interval for the derived ages is larger 
than 0.1 dex for many objects in the crucial 3-10 Gyr range. 

Finally, the Hipparcos data have also allowed the dis- 
covery of large temperature shifts between models and data 
(see Section I3.4H . that were not corrected by E93 or Ng & 
Bertelli (1998) and also affect the age determinations. 

4.2 Bayesian ages for the E93 sample 

Posterior age pdf's were computed for the objects in the 
E93 sample using the method of Section [2.31 The age prior 
is taken as flat, with a cutoff at t ma x = 15 Gyr, the mass 
prior as a power function of slope —2.35. The metallicity 
prior is a Gaussian centered on [Fe/H] = —0.15 with a dis- 
persion of 0.19 above [Fe/H] = —0.53, and a constant below 
[Fe/H] = -0.53 with a cutoff at [Fe/H] = -1.0 (see Fig.0. 
This function is a visual approximation of the metallicity 
distribution of the whole GCS, and is also compatible with 
the volume-limited distribution for the solar neighbourhood 
derived by j0rgensen (2000) . The stellar population synthe- 
sis code IAC-star (Aparicio & Gallart 2004) was used for the 
Monte-Carlo estimation of p(m, t, z) and the T transforma- 
tion. The E93 metallicities were put on the scale of Santos et 
al. (2002) by a shift of +0.12 dex. The temperature scale in 
GCS (as of 2003) was used, adjusted by a shift of +0.006 to 
obtain a satisfactory match between the Padua isochrones 
and the GCS data in the theoretical plane 4 . 

Gaussian uncertainties were assumed on [Fe/H], logT 

4 The discussion of this mismatch is beyond the scope of this 
article, but it is certainly worth enquiring into and is a strong 
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Figure 8. Posterior age pdf, between and 15 Gyr, for represen- 
tative late-type examples of the updated E93 data. The dotted 
line indicates a Gaussian of 0.1 dex dispersion around the E93 
age value, for comparison. 
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and My. Several different sets of values were used for the 
standard dispersions of the uncertainties. See Section for 
a confrontation of different cases. For our "standard" com- 
putation, we used <Ti og T = 0.009 gmv = 0.15 and <J[Fe/H] = 
0.075. We arrive at these values by using the uncertainties 
proposed by Ng & Bertelli for their revision of the E93 sam- 
ple (a[Fe/H] = 0.05, criogT = 0.006), an My uncertainty of 
0.10 mag (5 percent uncertainty on the distance), and al- 
lowing for the possible presence of systematic errors of sim- 
ilar amplitude by multiplying all values by a factor 1.5. It 
is important to remember that differences such as a zero- 
point shift in the metallicity scale or temperature scale have 
a strongly non-linear effect on the age determination (i.e. 
shifting the temperature scale does not produce a single shift 
of all age values but very different shifts depending on the 
position in the CMD). Systematic zero-point differences of 
0.10 dex in [Fe/H] and 0.005 in logT are common between 
different scales. Indeed, as mentioned above, shifts of such 
magnitude were found necessary to match the theoretical 
isochrones with the observational data. The use of [Fe/H] 
itself as a surrogate for the total heavy-element abundance 
used in the theoretical models is also subject to uncertain- 
ties, given the observed variations in the abundance ratios 
from star to star. In the Bayesian treatment, these sources of 
error have to be integrated into the assumed observational 
uncertainties. It is crucial to use the real difference between 
the data and the evolution tracks to estimate the likelihood, 
not the relative difference. 

Fig. |H1 gives posterior age pdf for a few representative 
objects in the sample, compared with a 25 percent (~ 0.1 

limitation on the accuracy of the isochrone age estimates (see 
review by Lebreton 2000 and Section |3~H. 



dex) dispersion Gaussian centered on the E93 age estimate. 
The results show that the shape and width of the posterior 
age pdf can vary a lot from one star to the next. Both the 
central value and the general shape of the posterior probabil- 
ity distributions of the age are often very different from that 
obtained by E93 with the T~ x approach. In some cases, a 
Gaussian is a valid approximation, but many stars are sub- 
ject to wider and very asymmetrical probability distribu- 
tions. Some have posterior pdf spanning most of the allowed 
0-15 Gyr range, so that the chosen age value is sensitively 
dependent on the assumed prior. In these cases, the derived 
age is not well determined (e.g. HD 115617, HD 177565, 
HD 98553 in the Figure). 



4.3 Age-metallicity relation 

Fig. [§Jj plots the Bayesian ages, with the updated data, in 
the age-metallicity plane, using the median of the posterior 
pdf as an age estimator. The binaries identified in GCS are 
indicated as crosses. 

Given the sensitivity of the age determination to the 
input parameters, it is important to use all the available 
information to identify possible outliers. The metallicities 
of E93 were confronted with the photometric metallicities 
derived in GCS, the Hipparcos parallax distances with the 
distances derived from the photometric calibrations in GCS. 
The following conditions were required for inclusion in the 
final sample: 

[Fe/H] E93 - [Fe/H] cos |< 0.2 
| rg93 ~ rGCS |< 0.2 

TE93 

where r is the distance. HD 67228, 112164, 199960 and 
207978 were identified visually as outliers on a comparison of 
temperatures between E93 and GCS, and of colour between 
the Stromgren (b — y) and Hipparcos (B — V). 

The objects singled out by the conditions above are in- 
dicated by open symbols in Fig. [^Jd. The data discrepancy 
indicates that they can be peculiar in some way, or that 
some of their measurements may be statistical outliers, so 
that the age determination may be affected. 

Fig. repeats the original E93 age-metallicity plot, 
for comparison. Two regions particularly important in giv- 
ing an impression of high dispersion in the age-metallicity 
relation, and particularly prone to contamination by skewed 
probability distributions of age with large error bars (see 
Section 12.31 . are indicated with dashed lines (Similar re- 
gions in the age-metallicity plot were previously used by 
Rocha-Pinto et al. 2000 to show that ages from chromo- 
spheric activity did not produce a high intrinsic dispersion 
in the AMR). The presence of many stars in these two re- 
gions in E93 contributed strongly to the conclusion of a very 
wide metallicity dispersion at all intermediate ages. 



4.4 Discussion 

Fig. |U] offers a spectacular confirmation of the effect of bi- 
ases and of the potential perils of replacing the complete 
posterior pdf by a single maximum-likelihood point in the 
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Figure 9. a The original age-metallicity plot from E93. b the age-metallicity plot for the E93 sample with new data: luminosities from 
Hipparcos distances, temperatures from GCS, ages from the present study. Symbols as follows: crosses— binary in the GCS catalogue, 
open circZes-photometric distance from E93 discrepant with Hipparcos distances by more than 20 percent, metallicity from E93 and from 
GCS by more than 0.2 dex, or temperature/colour discrepant between E93 and GCS. c The age-metallicity relations used in the Bayesian 
models of Section |3 d Simulated age-metallicity plot, resulting from an age-metallicity relation of total range A[Fe/H] = 0.4 dex, and 
observational uncertainties o~yp e /u\ = 0.05, cry = 75 K, om v = 0.15 mag, systematic shifts 0.12 dex in metallicity and +0.005 in logT. 
In the two top plots, the dashed lines outline two regions that are important for the evaluation of the dispersion of the age-metallicity 
relation, and where contamination from biased low-accuracy ages is expected to be significant (see text). "Region I" near solar metallicity, 
age> 5 Gyr, "Region II" metal-poor, age< 6 Gyr. 



age-metallicity plot. Indeed, the new data give a strong in- 
dication that there were many objects in the high-age, high- 
metallicity part of the original E93 AMR plot that had been 
displaced in the subgiant zone by high observational uncer- 
tainties and binarity ("terminal age bias", in our terminol- 
ogy). The upper dashed zone in the AMR becomes practi- 
cally empty with the new data. All four points remaining in 
it are compatible with being one-sigma outliers from younger 
ages. 



Fig. 1101 displays the age-metallicity diagram for our 
selected objects, plotting the half-maximum and tenth- 
maximum intervals of the posterior age pdf for a few repre- 
sentative objects to give a feeling for the shape of the age 
pdf. Contrarily to the original E93 AMR, the updated AMR 
diagram outlines a definite monotonic relationship between 
age and metallicity for intermediate ages. Part of the scatter 
in the original relation is removed. A simple linear fit gives 
[Fe/H] = -0.056 age + 0.011, with a dispersion of 0.18 dex in 
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Figure 10. Median Bayesian age in the age-metallicity plot for 
our selection of the E93 sample. For a few objects, the full-width- 
half-maximum and the full width at tenth maximum of the pos- 
terior age pdf are indicated. The solid and dashed lines indicate 
respectively the mean and envelope AMR used for Fig. 1111 
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Figure 11. Mass-metallicity relation for our sample. Symbols 
as in Fig. |[J>. The thick lines indicate the lower envelope of the 
mass-metallicity relation for a low-dispersion AMR (solid line in 
Fig. ED with a °- 24 dex 

range in [Fe/H]) and high dispersion 
AMR (dashed line). The oblique dotted and solid lines indicate 
the zones affected by the temperature selection biases. 
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metallicity, to be compared with 0.24 dex in E93. This value 
still includes the scatter introduced by the age uncertainties 
and by the fact that the actual AMR may not be linear 5 . 
Therefore, 0.18 dex can be considered a strict upper limit 
for the "cosmic scatter" in the AMR. 

Using the posterior age pdf that we obtain for the E93 
objects, Monte Carlo simulations of the whole procedure 
were carried out to determine what dispersion is expected 
from the observational uncertainties alone, assuming a dis- 
persionless AMR. We find that a dispersion at fixed age 
a [Fe/H] —0.10 is introduced around a dispersionless AMR 
by the observational uncertainties. By quadratically sub- 
tracting this dispersion from the value of 0.18 found in the 
observed AMR, we estimate the remaining intrinsic scatter, 
the so-called "cosmic scatter", to be at most 0.15 dex. 

Therefore, with the improved data and detailed treat- 
ment of the age probability distribution, the E93 sample no 
longer indicates a very high scatter in the metallicity at a 
given age, or a near-absence of AMR in the intermediate 
age range. While still clearly distinct from a dispersionless 
AMR, the data indicate a rather well-defined growth of mean 
metallicity with time, with an intrinsic dispersion of the or- 
der of 0.15 at most. 

A similar conclusion was already obtained by Rocha- 
Pinto et al. (2000) on the basis of chromospheric ages for 
solar-neighbourhood dwarfs. Their result was subsequently 
criticized on the basis of the observed disagreement between 
isochrone ages and chromospheric ages (e.g. Feltzing et al. 

5 as well as the increased scatter introduced by the deliberate 
selection by E93 of objects of different metallicities and masses. 
See their discussion on this point. 



2001). However, the present study brings support to the va- 
lidity of the indications from chromospheric ages and shows 
how the apparent disagreement could arise from strong sys- 
tematic errors on the isochrone ages. 

In the following two paragraphs, we examine other ways 
to look at the data that can add more indications on the 
reality of the high intrinsic scatter, by considering the two 
regions, "Region I" and "Region II", that we defined in the 
age-metallicity diagram on Fig. [!|] 



4.5 "Region I": solar metallicity, age > 5 Gyr 

The majority of the objects placed in or near "Region 
I" of the CMD have been excluded by our quality crite- 
ria. There are five detected binaries. The other objects are 
stars with M ~ O.9M0, for which the Hipparcos distance 
is much smaller than the distance obtained by E93 from 
the photometry (more than 20 percent difference, these ob- 
jects are HD 76151, 86728, 108309, 115617, 127334, 177565 
and 217014). Because the main-sequence is narrow for M ~ 
O.9M0, these smaller distances, implying fainter My, are 
sufficient to bring these stars from the subgiant zone (with 
apparently well determined ages in the 5-10 Gyr range) back 
on the main sequence. These objects were therefore outliers 
of the Stromgren My calibration that had been preferen- 
tially selected by the AM V > 0.4 criteria of E93. The Hip- 
parcos distances put them back on the main-sequence where, 
for such low masses, no accurate estimate of the age can be 
given. 

This illustrates a side-effect of the AMv > 0.4 selec- 
tion criteria used by E93. Intended to select only stars in 
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the region of the CMD were isochrones are well-spaced, it 
also samples the region of the CMD where the difference 
between likelihood and posterior is larger, and the propor- 
tion of undetected binaries is higher. The E93 study picked 
up f89 stars out of more than 19'000, and in the process 
it favors the 2-3 a outliers from the main sequence. Simple 
statistical considerations show that the number of main se- 
quence contamination in the AMv > 0.4 zone must be large. 
The Bayesian approach automatically includes this selection 
effect in the posterior pdf, by taking into account the prior 
distribution prob(t), which is heavily weighted towards the 
main-sequence, and the bias caused by the selection criteria 
is absent in our updated sample, thanks to the fact that the 
Hipparcos parallaxes are used to re-compute values of My 
that are independent of the original photometric Mv used 
in the sample selection. 

Note that "Region I" is even more populated in the 
higher-uncertainty samples of Ibukiyama & Arimoto (2000) 
and Feltzing et al. (2001). The presence of many objects 
in this region of the age-metallicity diagram does not give 
any indication of the real existence of such high-metallicity, 
intermediate-age stars. On the contrary, the E93 sample 
tends to indicate that this region is actually empty, because 
there is not a single bone fide subgiant with an age above 7 
Gyr and a metallicity near solar. 

Interestingly, the Sun, at t — 4.5 Gyr and [Fe/H]=0, 
seems to be near the upper edge of the compatible age- 
metallicity distribution of the sample (Wielen et al. 1996). 
This may be related to the observed statistical overabun- 
dance of planet host stars (Gonzalez 1998; Santos, Israelian 
& Mayor 2000). 

4.6 "Region II": metal-poor, age < 6 Gyr 

Let us now consider the second dashed region, "Region II", 
in the low-age (t < 6 Gyr), low-metallicity ([Fe/H] < 0.5) 
part of the diagram. 

The age-metallicity diagram shows a few data points in 
this region that may or may not have been scattered there by 
the uncertainties on the age determination (for none of these 
objects is the whole posterior age pdf entirely contained in 
" Region II" ) . Fortunately, there is another way to determine 
whether this region of the AMR is really occupied. 

The age estimation from isochrones also provide an es- 
timation of mass. The mass can be derived from theoretical 
tracks in a more reliable way that the age, because the mass 
changes relatively slowly with the observed parameters. A 
fundamental feature of stellar evolution is the fact that the 
duration of the main-sequence phase is a very sharp func- 
tion of mass. This sharp dependence implies the following: 
only stars below a certain mass can reach ages above a given 
age. E.g., only stars with masses below 1.2 Mq can reach 
ages above 5 Gyr, and masses below 1 Mq ages above 10 
Gyr. This relation between mass and maximum age implies 
that the lower envelope of the mass-metallicity relation will 
depend on the AMR and its dispersion. As we move to lower 
masses, higher ages become available and the metallicities 
reached at these ages begin to appear in the age-mass rela- 
tion. 

Therefore, if there really are low-age (3-6 Gyr), low- 
metallicity ([Fe/H] < —0.5) stars -objects within Region II- 
then we expect such stars to be of all masses able to reach 



at least 3 Gyr of age, M <~ 1.3M0. On the other hand, 
if Region II is not actually occupied, and the points in the 
age-metallicity diagram are scattered into it by the uncer- 
tainties from higher ages, then this should be revealed by the 
absence of M ~ 1.1 — 1.3M0 stars with lower metallicity. 
The region in the mass-metallicity diagram from which low- 
metallicity stars start to be found gives a definite indication 
of the minimal age of such stars. 

Fig. llll shows the mass-metallicity plot for the E93 sam- 
ple with the updated data. E93 do not give mass estimates 
for their stars. Masses for Fig. llll were computed by us as a 
by-product of the age computation. The upper envelope of 
the relation between mass and maximum age was adjusted 
on the GCS data as log tn m — 1.09 — 4.35 log(mass). Via this 
relation, the lower envelope of the AMR can be converted 
into a lower envelope in the mass-metallicity plot. Fig. 1111 
gives the predicted lower envelopes of the mass-metallicity 
relation with the two AMR plotted in Fig. I1UI The first is a 
A = 0.24 dex relation defined by the solid line on Fig. 1101 
and the second (dashed line) is the lower envelope of an 
AMR with very high intrinsic scatter, of the type inferred 
from the original E93 interpretation. The crucial difference 
between the two AMR is that one predicts the real exis- 
tence of objects in Region II while the other does not. The 
zones affected by the selection biases of E93 are also indi- 
cated in Fig. 1111 Selection becomes increasingly unlikely as 
one moves from the dotted to the solid limits. 

The result definitely leans towards the absence of low- 
metallicity, low-age stars. Although the selection biases af- 
fect the region of interest, the observed envelope clearly 
favours the low-dispersion AMR model. The most significant 
evidence is the lack of metal-poor, 1.1-1.3 Mq stars. This 
absence is best explained by the fact that 1.1-1.3 Mq stars, 
with maximum ages in the 5-8 Gyr range, are too young to 
have experienced metallicities as low as [Fe/H] = —0.6. Con- 
sequently, the objects observed in "Region II" in the AMR 
plot are lower-mass objects, scattered from higher ages by 
the observational error. A fact fully compatible with their 
age probability pdf. 

This is a solid indication that the lack of definite AMR 
in the E93 sample is only apparent, independently of the 
discussion of the age probability pdf. It should be confirmed 
with samples with wider selection criteria and more objects, 
for instance by obtaining precise spectroscopic metallicities 
for a sample of 1.1-1.3 Mq stars with a higher temperature 
cut. 



5 THE DISPERSION IN THE AMR FROM 
BAYESIAN MODEL COMPARISON 

As the present study has indicated, the derivation of the 
AMR and its intrinsic dispersion from the age-metallicity 
plot is made difficult by the shape of the age uncertain- 
ties. Replacing prior-dependent age estimates with large 
and asymmetrical uncertainty distributions by single points 
makes direct "eyeball" analysis unreliable, and does not per- 
mit the collection of the metallicity data into separate age 
bins. 

However, this does not imply that the data cannot be 
used to study the AMR. The posterior age pdf contains all 
the information available on the ages, and there are other 
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ways to analyse the data and constrain the dispersion of 
the AMR, such as the mass-metallicity relation used in Sec- 
tion ED 

The Bayesian framework also provides tools for the 
comparison of different models. Let us call M the model as- 
suming a particular AMR with an intrinsic range A[Fe/H]. 
The model consists of 

• a mean age-metallicity relation: [Fe/H] = f(t) with an 
intrinsic range A[Fe/H], 

• stellar evolution models L = Th(m,t,z) and T = 
T T {m,t,z), 

• assumptions about the distribution of the observa- 
tional uncertainties (for instance Gaussian on log T, Mv and 
[Fe/H]). 

What one wants to compute is prob(7M | D), the prob- 
ability of the model M being true, given all the data D. In 
practice, one is not interested in the normalized probabilities 
but wishes to compare the probabilities of two models. 

Using Bayes' theorem, 

prob(A<lI))^prob(^) Pi ;^) 

the ratio of the probabilities for two models Mi and M2 is: 
prob(Xi 1 D) _ prob(Xi) prob(£> | Mi) 
prob(A / (2 I D) prob(Al2) prob(D | M2) 

where the unknown normalization prob(D) vanishes in the 
ratio. 

If the data points are independent, then the global term, 
prob(D I M) can be broken down into a product of individ- 
ual probabilities for the individual data points di. 



prob(D \M) = TTprob(di|.M) 



As in Section [2.31 we marginalize over the mass and metal- 
licity, but now also over the age: 



prob(di I M) 



pTob(di,m, t,z\M) dm dtdz 



Using the probability product rule: 
prob(di I M) = 

proh(di\M, m,t, z) prob(m, t,z\M) dm dtdz 

The first term in the integral is the likelihood, 

pmb(di\M, m, t, z) = £(T i5 U, [Fe/H]i , m, t, z) 

and the second term is the prior in (m, t, z) according to 
model M, which we note p(m,t,z). Then, 

pmb(M I D) = prob(M) 



n 



p(m,t, z)C(Ti,Li, [Fe/H];, m, t, z) dm dtdz 



As in Section 12.31 the integral over the (m,t,z) space 
can be evaluated by a Monte Carlo method: 

N n 

prob(M I D) =pTob(M)Yl^2^(T i ,L i ,[Fe/B] i ,m j ,t h z j ) 

i=i j=i 



where the (mj,tj , Zj) triplets are n draws according to the 
probability distribution function p(m,t,z). 

The final step is to compute prob(TM), the prior prob- 
ability of the model. As shown for instance by Sivia (1996), 
in the case of varying the A[Fe/H] parameter, prob(A / () is 
simply inversely proportional to A [Fe/H]. In the case of 
a dispersionless relation with A[Fe/H] = 0, prob(7V() ~ 
(s/Qsko I i \/N)~ 1 , where a is the observational uncertainty 
and N the number of data points. 

The Bayesian posterior probability prob(Alj-D) was 
computed for a set of models with different assumptions 
on the AMR, using the same parameters as in Section [4.21 
with 10 6 draws on the Monte Carlo integration. The results 
are displayed in Table Q The probabilities are given rela- 
tive to Model 1. Model 1 assumes no relation between age 
and metallicity (flat AMR) , with a total metallicity range of 
A [Fe/H] =0.80 dex. Model 2 is a dispersionless AMR, linear 
in z, with [Fe/H] = -0.3 at t = 2.5 Gyr and [Fe/H] = -0.8 
at t — 13 Gyr. Model 3 assumes the same AMR, but with 
a flat range of A[Fe/H]=0.25 in the metallicities at a given 
age (standard deviation <7[F e /H] ~ 0.07 dex). Model 4 is 
the same AMR, with A[Fe/H]=0.40 {cr [Fe/H] ~ 0.12 dex). 
Model 5 is an AMR of inverse slope, for comparison. To con- 
centrate on intermediate-age, thin disc objects - the objects 
for which E93 indicate a high scatter - we use the selection 
criteria mass < 1.2M© (removing very young objects) and 
Rm > 7 kpc (removing thick disc objects), where R m is the 
mean radius of the galactic orbit computed by E93. 

Table gives the logarithm of the resulting probabili- 
ties, log 10 P(M j D). The probabilities were computed as- 
suming G[Fe/H] — 0.075, oiogT — 0.009, and om v =0.15. Two 
objects have posterior probabilities 10 -3 below the maxi- 
mum for model 3. Such outliers have an excessive weight 
in the Bayesian model comparison because Gaussian uncer- 
tainty distributions are assumed in the likelihood. In the 
real world, unaccounted causes such as binarity or misiden- 
tification leads to uncertainty distributions that have flatter 
wings than Gaussians. The calculations were therefore also 
done without these two objects (column "clipped" of Ta- 
ble [TJ. 

The total probability was also computed allowing for 
the possible presence of undetected binaries. As an upper 
limit to the possible contamination, a proportion of 5 per- 
cent of equal-mass binaries was assumed (fifth column of Ta- 
bic 0. Finally, the calculations were also done with another 
set of assumed observational uncertainties, (J\p a im = 0.10, 
o~io g T = 0.01, and cxm v =0.10. 

Table shows that the low-, medium- and high- 
dispersion models are within one or two decades of each 
other in total probability. The Bayesian computation shows 
that the model with a range of 0.4 dex in metallicity at a 
given age, implying a standard dispersion of 0.12 dex around 
a single AMR, is as favoured by the data as the high-intrinsic 
scatter AMR within reasonable variations in the assump- 
tions. Therefore, the data do not clearly favour a high dis- 
persion model over a low dispersion model of the AMR when 
the whole age pdf's are taken into account. 
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Model 


A[Fc/H] 




lnprob(Model) 














raw 


clipped 


with 5% 


larger 










2 objects 


binaries 


errors 


Model 1 


0.80 


0.24 


0.00 


0.00 


0.00 


0.00 


Model 2 








-18.30 


-14.75 


-11.19 


-9.27 


Model 3 


0.25 


0.07 


-2.94 


+0.32 


-2.75 


-0.78 


Model 4 


0.40 


0.12 


-0.63 


+2.24 


-0.53 


+1.06 


Model 5 








-46.36 


-40.45 


-41.93 


-29.20 



Table 1. Total logarithmic posterior probability log 10 prob(.M|D), for different models M of the age-metallicity relation, given the data 
D about the E93 sample. Model 1: no significant AMR and high intrinsic scatter; model 2: dispersionless linear AMR; models 3 and 4: 
linear AMR with low and medium intrinsic dispersion; model 5: inverse dispersionless AMR. "Raw": all data with mass < 1.2Mq and 
fi m > 7. "Clipped": without HD 84737 and HD 88986. "with 5% binaries": assuming 5 percent undetected equal- mass binaries. "Other 
errors": assuming cry Fe / H ^ = 0.10, cr logT = 0.01, and <tm v =0.10 



6 CONCLUSION 

6.1 Conclusions about E93 and the Galactic AMR 

The conclusion of our reappraisal of the implication of the 
E93 sample for the age-metallicity relation in the Galac- 
tic disc is that the data provide no solid evidence for the 
presence of a a ~ 0.24 metallicity range at fixed age (or 
a A ~ 0.7 range), as usually stated. On the contrary, 
new data and a Bayesian age determination put an upper 
limit of a — 0.15 dex on the intrinsic scatter of the AMR. 
An extended Bayesian probability analysis shows that the 
age probability distributions are much wider than realized, 
and that visual interpretation of an age-metallicity plot like 
Fig. |5| is likely to be misleading. The age uncertainties are 
also too large for an age binning of the data to be made with 
any confidence. 

New Hipparcos parallax and Coravel radial velocity 
data on the same sample confirm the doubts introduced 
by the Bayesian approach, and show that many outliers on 
the age-metallicity diagram are indeed detected binaries or 
stars with either discrepant distance estimates or discrepant 
metallicity estimates. Many ages are also put nearer to the 
mean AMR by the temperature adjustments found neces- 
sary between the stellar evolution models and the observa- 
tions. This, together with a Bayesian model-testing analysis, 
point to a rather well-defined AMR with a smaller metallic- 
ity gradient at fixed age, with a standard deviation of the 
order of 0.15 dex or lower, or a total range of <~0.4 dex at 
a given age. 

This lower range is confirmed by examining the be- 
haviour of the data in two specific zones of the mass- 
metallicity plot, showing the absence of young, metal-poor 
stars in the [Fe/H] ~ —0.5, t < 6 Gyr zone, and of old, 
solar-metallicity stars in the [Fe/H] ~ 0.0, t > 5 zone. 

The implication is that there is no mandatory need at 
this point for galactic models to reproduce a very large scat- 
ter of metallicity in the ISM at a given time and galactocen- 
tric radius for the Galactic disc. It restores the coherence 
with the numerous other indications of a low present-day 
dispersion in the abundance of the gas (e.g. ISM, cepheids, 
HII regions, see Introduction for references). The remain- 
ing dispersion is still quite large, and shows that a simple, 
single- AMR model is not sufficient. But it lies within the val- 
ues observed in other star-forming galaxies, and indicated by 
Galactic open clusters. It is also within the scale of what rea- 
sonable chemical inhomogeneities and radial orbital mixing 



can achieve without the need to invoke long-lived extreme 
inhomogeneities or infall in the past. 

6.2 General Conclusions and recommendations 

Looking beyond the E93 sample to future studies of the 
chemical and dynamical history of the Galactic disc, we now 
consider some implications of our results. 

Metallicities with an internal uncertainty of ~ 0.05 
dex, as in E93, with Hipparcos distances (<r M ~ 0.1 mag 
at 50 pc) are still about as accurate as can presently be 
achieved in terms of uncertainties of the observables. Colour- 
temperature transformations, bolometric corrections and 
model temperature errors are also sources of uncertainties 
that are proving difficult to reduce below the level of 0.01 
dex on logT and 0.10 mag on My. 

With these kinds of accuracies, the posterior age pdf's 
are often wide and asymmetrical, especially for later-type 
stars - the ones most useful in the study of the history of 
the Galaxy. In that case, ages computed with the standard 
method can be strongly biased, and replacing the full prob- 
ability distribution by a single central value can lead to mis- 
leading impressions. 

For large samples, uncertainties of about 0.10 dex or 
larger in [Fe/H] are more typical (see for instance Ibukiyama 
& Arimoto 2002, Feltzing et al. 2001), implying even wider 
age pdf's. In this case, it should be realized that when the 
probability distributions for the ages are much wider than 
the dispersion of the points themselves, adding more points 
only provides a better definition of these probability distri- 
butions themselves, without actually adding much informa- 
tion on the underlying age distribution. This regime dom- 
inated by systematic effects is clearly apparent in fig. 5 of 
Ibukiyama & Arimoto (2002) and fig. 10 of Feltzing et al. 
(2001) as the "wave-shape" in the age-metallicity diagram. 
Not only is a mean metallicity decreasing with time near 5- 
10 Gyr difficult to understand in terms of galactic evolution, 
but it is also exactly the kind of shape that we expect with 
a bias towards the terminal age (the "Region I" in Fig. 0. 
Such a revealing shape is also apparent in the AMR plot 
of GCS. Thus, as correctly reckoned by E93, a small, low- 
error sample is preferable in this regime to a large, high-error 
sample. 

We also note that selections of subsamples by impos- 
ing a limit on the relative age error, e.g. <r ago /age < 0.5 as 
in Feltzing et al. (2001), should be avoided, because they 
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strongly reinforce the "terminal age bias" (see Section \l.2l . 
Because age tT uc is not accessible, the selection is in fact 
o"a g e/age obs < limit, which favours ages near the upper limit 
of their error bar with a low apparent o" ago . For instance, in 
our Fig. 2] such a selection would pick up only the most 
strongly biased ages with age obs ~ 10 Gyr. 

The following suggestions are proposed for the compu- 
tation of isochrone ages and the study of the history of the 
Galactic disc: 

- For late-type stars, the posterior age pdf can be com- 
puted rather than the "nearest isochrone" age which can be 
strongly biased. 

- Smaller samples with lower uncertainties should be pre- 
ferred to large samples with higher uncertainties. 

- The full age pdf should replace Gaussian approxima- 
tions to examine the compatibility of the data with a given 
hypothesis. The age pdf often has wide and flat wings. 
With such strongly ungaussian distributions, mathematical 
hypothesis-testing can be sounder than eyeball analysis. 

- The mass vs. age plot can be used as a diagnostic. If 
the derived ages cluster towards the end-of-main-sequence 
lifetime, they are probably subject to a strong systematic 
bias ("terminal age bias"). 

- Relative error selection criteria (<r agc /age < limit) 
should be avoided to form subsamples with better deter- 
mined ages. a- agc < age ms , where age MS is the main sequence 
lifetime at the star's mass, is a good alternative. 

Our results also tend to rehabilitate the method of age 
determination from chromospheric activity. Discrepancy be- 
tween chromospheric and isochrone ages had led to some sus- 
picion of unrecognized uncertainties in the former method 
(see Introduction). However, according to our study, a large 
part of the mismatch can be attributed to the systematic 
effects affecting direct isochrone ages. The age-metallicity 
relation using chromospheric ages shows a lower scatter, 
adding further confidence in the reliability of chromospheric 
ages. A detailed Bayesian comparison of isochrone and chro- 
mospheric ages would be useful in this context. 

Finally, our study suggests that given the high sensitiv- 
ity of the age determination to observational uncertainties 
and in particular to statistical outliers, it can be very useful 
to combine independent determinations of the input quan- 
tities - temperature, luminosity, metallicity - in order to at- 
tempt to identify the objects which may be such outliers. 
The strongest biasing effects are highly non-linear and can 
be much reduced by removing such objects. 

An interactive code to compute Bayesian age estimates 
for Galactic dwarfs is available at the following website: 

\protect\vrule widthOpt\protect\href {http : / / obswww . unige 
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